Fast Generation of a Sequence of Trained and Validated Feed-Forward Networks
Abstract
In this paper, three approaches are presented for generating and validating sequences of neural nets of different sizes. First, a growing method is given, along with several weight initialization methods and their properties. Then a one-pass pruning method is presented that utilizes orthogonal least squares. Based upon this pruning approach, a one-pass validation method is discussed. Finally, a training method that combines growing and pruning is described. In several examples, it is shown that the combination approach is superior to growing or pruning alone.
Compilation copyright © 2006, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.
Introduction
According to the structural risk minimization (SRM) principle, a sequence of learning machines of increasing size should be produced, and each should be trained as well as possible. The machine with the smallest validation error is the best compromise between the training error and the complexity of the network. If this principle is followed with neural nets, the final training error as a function of the number of hidden units, Ef(Nh), will be monotonically nonincreasing.
Sequences of networks are produced through growing methods and pruning methods. In growing methods (Delashmit 2003), one can design a set of different size networks in an orderly fashion, each with one or more additional hidden units relative to the previous network (Fahlman et al. 1990) (Chung & Lee 1995). A drawback of growing methods is that the network can get trapped in local minima and is sensitive to initial conditions. In pruning methods, a large network is trained and then less useful nodes or weights are removed (Hassibi et al. 1993) (LeCun et al. 1990) (Sakhnini, Manry, & Chandrasekaran 1999). Some pruning algorithms remove less useful hidden units using the Gram-Schmidt procedure, as reported by Kaminsky and Strumillo (1997) for radial basis functions and Maldonado et al. (2003) for the multilayer perceptron (MLP).
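As a rough illustration of how a Gram-Schmidt procedure can rank hidden units by usefulness, the sketch below greedily orders units by the error reduction each one contributes to a linear output layer. It is a generic orthogonal-least-squares style ordering written for this summary, not the one-pass method of the paper; the function name `order_hidden_units`, its arguments, and the column-list data layout are all assumptions made for the example.

```python
def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _gain(col, residual, eps):
    # Error reduction obtained by projecting the residual onto this column.
    norm2 = _dot(col, col)
    if norm2 <= eps:
        return 0.0
    return sum(_dot(col, r) ** 2 for r in residual) / norm2


def order_hidden_units(hidden_cols, target_cols, eps=1e-12):
    """Greedy Gram-Schmidt ordering of hidden units (illustrative sketch).

    hidden_cols: Nh lists, one per hidden unit, holding its activations
                 over all training patterns (hypothetical layout).
    target_cols: M lists, one per output, holding the desired values.
    Returns (order, errors), where errors[k] is the residual sum of squares
    when only the first k + 1 units in `order` feed the linear outputs.
    """
    basis = {j: list(col) for j, col in enumerate(hidden_cols)}
    residual = [list(col) for col in target_cols]
    order, errors = [], []
    while basis:
        best = max(basis, key=lambda j: _gain(basis[j], residual, eps))
        q = basis.pop(best)
        norm2 = _dot(q, q)
        if norm2 > eps:
            # Remove the chosen direction from the residual and from the
            # remaining units, so later gains measure only new information.
            for vec in residual + list(basis.values()):
                c = _dot(q, vec) / norm2
                for i in range(len(vec)):
                    vec[i] -= c * q[i]
        order.append(best)
        errors.append(sum(_dot(r, r) for r in residual))
    return order, errors
```

Truncating `order` after its first k entries gives a pruned network with k hidden units, and the returned `errors` list is nonincreasing in k, which matches the monotonic behavior expected of Ef(Nh).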
Unfortunately, if one large network is trained and pruned, the resulting error versus Nh curve is not minimal for smaller networks. In other words, it is possible, though unlikely, for each of the hidden units to be equally useful after training.
In this paper, we investigate three approaches for generating and validating sequences of neural nets of different sizes. First, a growing method is described, along with analyses of weight initialization methods. Then a pruning method is presented that requires only one pass through the training data. A one-pass validation method, based upon pruning, is presented next. Finally, a method that combines growing and pruning is presented. Results show that this third approach overcomes the shortcomings of growing or pruning alone.
Multilayer Perceptron
Figure 1 depicts a feed-forward MLP having one hidden layer with Nh nonlinear units and an output layer with M linear units. For the pth pattern, the jth hidden unit's net function and activation are
Publication date: 2006